W203 Lab 1: Hypothesis Testing

Import Packages

Part 1 Foundational Exercises -- Applied Practice

1a sample answer. Find type 1 error

1b sample answer. Calculate Statistical Power

2 sample answer.

A paired t-test tests whether the mean difference between two metric variables is zero. Because the paired data in this case are ordinal, the difference between two values on a Likert scale is not a meaningful quantity. Specifically, the interval between one pair of adjacent points on the scale need not equal the interval between another pair of adjacent points, which is what makes the variable ordinal rather than metric. Taking a mean or standard deviation of these differences would feed meaningless values into the test statistic and render the results uninterpretable.
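The point about unequal intervals can be illustrated with a small sketch. The paired responses and the two numeric codings below are hypothetical, invented only for illustration. Both codings preserve the order of the Likert categories, yet the mean paired difference changes sign between them, which suggests the statistic depends on an arbitrary labeling choice rather than on the data.

```python
from statistics import mean

# Hypothetical paired Likert responses (category labels 1-5), before and after.
before = [1, 1, 1, 5]
after = [2, 2, 2, 4]

# Two order-preserving numeric codings of the same five categories.
# Neither is "the" correct coding; ordinal data only fixes the order.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}


def mean_paired_difference(coding):
    """Mean of (after - before) once categories are mapped to numbers."""
    return mean(coding[a] - coding[b] for a, b in zip(after, before))


print(mean_paired_difference(equal_spacing))   # positive under equal spacing
print(mean_paired_difference(stretched_top))   # negative once the top gap is stretched
```

A rank- or sign-based test would not be affected by this kind of recoding, because it uses only the ordering of the responses.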

Part 1 Foundational Exercises -- Test Assumptions

1 sample answer.

Two-sample t-test assumptions:

  1. independent samples
  2. metric data
  3. population is approximately normal, unless the sample size is large enough for the CLT to apply
  4. similar variances
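As a rough sketch of why the similar-variances assumption matters, the code below computes the pooled (Student) t statistic, which relies on that assumption, next to the Welch statistic, which does not. The function names and the two samples are hypothetical and are not taken from the lab data. Only the Python standard library is used.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Student's two-sample t statistic, assuming equal population variances."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))


def welch_t(x, y):
    """Welch's t statistic, which does not assume equal variances."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))


# Hypothetical metric samples, for illustration only.
group_a = [6.0, 7.0, 5.0, 8.0, 7.0, 6.0]
group_b = [5.0, 4.0, 6.0, 5.0, 3.0, 5.0, 9.0, 1.0, 5.0]

# A quick informal check of assumption 4: ratio of the sample variances.
print("variance ratio:", variance(group_b) / variance(group_a))
print("pooled t:", pooled_t(group_a, group_b))
print("Welch t: ", welch_t(group_a, group_b))
```

When the two statistics differ noticeably, as they can when both the variances and the sample sizes are unequal, the pooled version is the one whose assumption is in doubt. With equal sample sizes the two statistics coincide, although their degrees of freedom still differ.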

Evaluation of Assumptions:

Missing Context: Are the intervals along the Cantril scale assumed to be equal?

2 sample answer.

Wilcoxon Rank Sum Test Assumptions:

Evaluation of Assumptions:

3 sample answer.

Assumptions for Wilcoxon Signed Rank:

Evaluation of Assumptions:

Missing Context: Alcohol and liver death rates should be measured on the same scale (for example, per x number of people) for their differences to be valid.

4a sample answer.

Assumptions for paired t test:

Evaluation of Assumptions:

Missing Context: Were respondents/samples gathered in a randomized manner?

Part 2 Statistical Analysis

Tres initial analysis of suitable variables for identifying R vs D, and target variables

Hypothesis Test Selection:

Our goal is to evaluate whether Democratic or Republican voters experience more voting difficulty. Our response variable, Voting Difficulty, is an ordinal variable measured on a Likert scale from 1 to 5, where 1 signifies less difficulty and 5 signifies more difficulty. Because the data are ordinal, a non-parametric test is needed. Additionally, we are comparing two distinct groups with no natural pairing, so a paired test is ruled out. The samples are independent, as one respondent's perception of voting difficulty does not inform another's response. Finally, the density of the response variable has a similar shape for both parties. Together, these characteristics meet the assumptions of a Wilcoxon Rank Sum Test.
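The mechanics of the test selected above can be sketched as follows. This is a minimal, standard-library illustration and not the analysis code used in the lab: the function names and the two response lists are hypothetical, ties receive average ranks, and the p-value uses the normal approximation with a tie correction and no continuity correction. A statistics package would normally be used in practice.

```python
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, z score, two-sided p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction for the variance of U under the null hypothesis.
    counts = {}
    for value in combined:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # every observation is tied, so there is no evidence either way
        return u1, 0.0, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, z, p_value


# Hypothetical difficulty ratings (1 = less difficulty, 5 = more difficulty).
democrat = [1, 2, 2, 3, 1, 4, 2, 5]
republican = [1, 1, 2, 1, 3, 2, 1, 2]

u, z, p = rank_sum_test(democrat, republican)
print(f"U = {u}, z = {z:.3f}, p = {p:.3f}")
```

Because a five-point scale produces many ties, the tie correction in the variance matters here; ignoring it would overstate the spread of U and make the test more conservative.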